Clustering Polysemic Subcategorization Frame Distributions Semantically

Authors

  • Anna Korhonen
  • Yuval Krymolowski
  • Zvika Marx
Abstract

Previous research has demonstrated the utility of clustering in inducing semantic verb classes from undisambiguated corpus data. We describe a new approach which involves clustering subcategorization frame (SCF) distributions using the Information Bottleneck and nearest neighbour methods. In contrast to previous work, we particularly focus on clustering polysemic verbs. A novel evaluation scheme is proposed which accounts for the effect of polysemy on the clusters, offering us a good insight into the potential and limitations of semantically classifying undisambiguated SCF data.
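
As a rough illustration of the nearest-neighbour idea mentioned in the abstract, the sketch below links each verb to the verb whose SCF distribution is closest and reads clusters off as connected components. It is a minimal sketch under stated assumptions, not the authors' implementation: the Jensen-Shannon divergence as the distance measure, the three frame types, the four verbs, and all probabilities are invented for illustration, and the Information Bottleneck method is not shown.

```python
import math

# Hypothetical SCF distributions over three assumed frames: [NP, NP-PP, S-comp].
# The verbs and probabilities are toy values, not data from the paper.
SCF = {
    "give":  [0.30, 0.65, 0.05],
    "send":  [0.35, 0.60, 0.05],
    "say":   [0.15, 0.05, 0.80],
    "think": [0.10, 0.05, 0.85],
}

def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2) between two discrete distributions."""
    def kl(a, b):
        # Terms with a zero probability in `a` contribute nothing.
        return sum(x * math.log2(x / y) for x, y in zip(a, b) if x > 0)
    m = [(x + y) / 2 for x, y in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def nearest_neighbour_clusters(dists):
    """Link every verb to its nearest neighbour; return connected components."""
    verbs = list(dists)
    parent = {v: v for v in verbs}

    def find(v):
        # Union-find lookup with path halving.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for v in verbs:
        nearest = min(
            (u for u in verbs if u != v),
            key=lambda u: js_divergence(dists[v], dists[u]),
        )
        parent[find(v)] = find(nearest)  # merge the two components

    clusters = {}
    for v in verbs:
        clusters.setdefault(find(v), []).append(v)
    return sorted(sorted(c) for c in clusters.values())

if __name__ == "__main__":
    print(nearest_neighbour_clusters(SCF))
```

On this toy input the transfer-like verbs and the sentential-complement verbs end up in separate clusters. A polysemic verb whose distribution mixes both frame patterns could sit between the two groups, which is the situation the proposed evaluation scheme is meant to account for.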

Similar resources

Inducing a Semantically Annotated Lexicon via EM-Based Clustering

We present a technique for automatic induction of slot annotations for subcategorization frames, based on induction of hidden classes in the EM framework of statistical estimation. The models are empirically evaluated by a general decision test. Induction of slot labeling for subcategorization frames is accomplished by a further application of EM, and applied experimentally on frame observatio...

Clustering Verbs Semantically According to their Alternation Behaviour

Verbs were clustered semantically on the basis of their alternation behaviour, as characterised by their syntactic subcategorisation frames extracted from maximum probability parses of a robust statistical parser, and completed by assigning WordNet classes as selectional preferences to the frame arguments. The clustering was achieved (a) iteratively by measuring the relative entropy between the...

Subcategorization acquisition

Manual development of large subcategorised lexicons has proved difficult because predicates change behaviour between sublanguages, domains and over time. Yet access to a comprehensive subcategorization lexicon is vital for successful parsing capable of recovering predicate-argument relations, and probabilistic parsers would greatly benefit from accurate information concerning the relative likel...

Detecting Dependencies Between Semantic Verb Subclasses And Subcategorization Frames In Text Corpora

There is a widespread belief among linguists that a predicate's subcategorization frames are largely determined by its lexical-semantic properties [23, 11, 12]. Consider the domain of movement verbs. Following Talmy [23], these can be semantically classified with reference to the meaning components: MOTION, MANNER, CAUSATION, THEME (MOVING ENTITY), PATH AND REFERENCE LOCATIONS (GOAL, SOURCE). L...

A Corpus-based Conceptual Clustering Method for Verb Frames and Ontology Acquisition

We describe in this paper the ML system, ASIUM, which learns subcategorization frames of verbs and ontologies from syntactic parsing of technical texts in natural language. The restrictions of selection in the subcategorization frames are filled by the concepts of the ontology. Applications requiring subcategorization frames and ontologies are crucial and numerous. The most direct applications ...

Publication date: 2003